104 research outputs found
Kyoto: An Integrated System for Specific Domain WSD
This document describes the preliminary release of the integrated Kyoto system for specific domain WSD. The system uses concept miners (Tybots) to extract domain-related terms and produces a domain-related thesaurus, followed by knowledge-based WSD based on wordnet graphs (UKB). The resulting system can be applied to any language with a lexical knowledge base, and is based on publicly available software and resources. Our participation in Semeval task #17 focused on producing running systems for all languages in the task, and we attained good results in all except Chinese. Due to the time constraints of the competition, the system is still under development, and we expect results to improve in the near future
Emotional Sentence Annotation Helps Predict Fiction Genre
Fiction, a prime form of entertainment, has evolved into multiple genres which one can broadly attribute to different forms of stories. In this paper, we examine the hypothesis that works of fiction can be characterised by the emotions they portray. To investigate this hypothesis, we use works of fiction from Project Gutenberg and we attribute basic emotional content to each individual sentence using Ekman’s model. A time-smoothed version of the emotional content for each basic emotion is used to train extremely randomized trees. We show through 10-fold cross-validation that the emotional content of each work of fiction can help identify each genre with significantly higher probability than random. We also show that the most important differentiator between genre novels is fear
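The time-smoothing step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which smoothing method is used, so the centered moving average below is an assumption, and the function name `smooth`, the `window` parameter, and the toy "fear" scores are all hypothetical. The classifier training step is not shown.

```python
from typing import List


def smooth(scores: List[float], window: int = 3) -> List[float]:
    """Centered moving average over per-sentence scores.

    Assumption: a simple moving average stands in for the unspecified
    smoothing in the abstract. Intended for odd window sizes; the
    window shrinks at the start and end of the sequence.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        segment = scores[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# Hypothetical per-sentence scores for one basic emotion (fear),
# in reading order: 1.0 means the sentence was tagged with fear.
fear = [0.0, 1.0, 0.0, 0.0, 1.0, 1.0]
smoothed_fear = smooth(fear, window=3)
```

A smoothed curve of this kind, one per basic emotion, could then serve as the feature vector for a genre classifier.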
SentiBench - a benchmark comparison of state-of-the-practice sentiment analysis methods
In the last few years thousands of scientific papers have investigated
sentiment analysis, several startups that measure opinions on real data have
emerged and a number of innovative products related to this theme have been
developed. There are multiple methods for measuring sentiments, including
lexical-based and supervised machine learning methods. Despite the vast
interest in the theme and the wide popularity of some methods, it is unclear which
one is better for identifying the polarity (i.e., positive or negative) of a
message. Accordingly, there is a strong need to conduct a thorough
apples-to-apples comparison of sentiment analysis methods, as they are
used in practice, across multiple datasets originating from different data
sources. Such a comparison is key for understanding the potential limitations,
advantages, and disadvantages of popular methods. This article aims at filling
this gap by presenting a benchmark comparison of twenty-four popular sentiment
analysis methods (which we call the state-of-the-practice methods). Our
evaluation is based on a benchmark of eighteen labeled datasets, covering
messages posted on social networks, movie and product reviews, as well as
opinions and comments in news articles. Our results highlight the extent to
which the prediction performance of these methods varies considerably across
datasets. To boost the development of this research area, we release the
methods' code and the datasets used in this article, deploying them in a benchmark
system, which provides an open API for accessing and comparing sentence-level
sentiment analysis methods
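A benchmark comparison of the kind described in the abstract above can be sketched as a loop that scores every method on every labeled dataset. This is an illustrative sketch only, not the SentiBench system or its API: the metric (macro-F1), the two toy methods, and the toy datasets are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Dataset = List[Tuple[str, str]]  # (message, gold polarity label)
Method = Callable[[str], str]    # maps a message to a polarity label


def macro_f1(gold: Sequence[str], pred: Sequence[str],
             labels: Sequence[str] = ("positive", "negative")) -> float:
    """Unweighted mean of per-label F1 scores (assumed metric)."""
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f1_scores.append(2 * precision * recall / total if total else 0.0)
    return sum(f1_scores) / len(f1_scores)


def benchmark(methods: Dict[str, Method],
              datasets: Dict[str, Dataset]) -> Dict[str, Dict[str, float]]:
    """Score every method on every dataset."""
    results: Dict[str, Dict[str, float]] = {}
    for method_name, method in methods.items():
        results[method_name] = {}
        for dataset_name, data in datasets.items():
            gold = [label for _, label in data]
            pred = [method(text) for text, _ in data]
            results[method_name][dataset_name] = macro_f1(gold, pred)
    return results


# Hypothetical toy methods: a tiny lexicon lookup and a constant baseline.
def lexicon_method(text: str) -> str:
    positive_words = {"good", "great", "love"}
    words = text.lower().split()
    return "positive" if any(w in positive_words for w in words) else "negative"


def always_positive(text: str) -> str:
    return "positive"


# Hypothetical toy datasets standing in for real labeled corpora.
datasets = {
    "toy_reviews": [("great movie", "positive"), ("boring plot", "negative")],
    "toy_tweets": [("love it", "positive"), ("not good", "negative")],
}
results = benchmark(
    {"lexicon": lexicon_method, "always_positive": always_positive},
    datasets,
)
```

Even in this toy setting the lexicon method scores differently on the two datasets, because it misreads the negated "not good", which mirrors the abstract's point that performance varies across datasets.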
The act of creating humorous acronyms
Our species cannot survive without humor and future human-machine interaction systems will be required to handle humor. From a practical point of view, humor is an important resource for getting selective attention, helping to memorize names and situations, etc. Even if deep modeling of humor in all of its facets is not something available in the near future, there is something concrete that has been achieved and that can help in bringing attention to the field. The paper refers to the results of HAHAcronym, a project devoted to humorous acronym production, a circumscribed task that nonetheless requires various generic components. The project opens the way to developments for creative language. Electronic commerce, for instance, will include flexible and individual-oriented humorous promotion more or less as it happens in the world of broadcast advertisement
Making Computers Laugh: Investigations in Automatic Humor Recognition
Humor is one of the most interesting and puzzling aspects of human behavior. Despite the attention it has received in fields such as philosophy, linguistics, and psychology, there have been only a few attempts to create computational models for humor recognition or generation. In this paper, we bring empirical evidence that computational approaches can be successfully applied to the task of humor recognition. Through experiments performed on very large data sets, we show that automatic classification techniques can be effectively used to distinguish between humorous and non-humorous texts, with significant improvements observed over a priori known baselines
Exploring the Lexical Semantics of Dialogue Acts
People proceed in their conversations through a series of dialogue
acts to yield some specific communicative intention. In this
paper, we study the task of automatically labeling dialogues with the
proper dialogue acts, relying on empirical methods and simply
exploiting lexical semantics of the utterances. In particular, we
present some experiments in both a supervised and an unsupervised
framework on an English and an Italian corpus of dialogue
transcriptions. In the experiments we consider the settings of
dealing with or without additional information from the dialogue
structure. The evaluation shows good results,
regardless of the language used. We
conclude the paper by exploring the relation between the
communicative goal of an utterance and its affective content
Bringing the Text to Life Automatically
Animated text is an appealing field of creative graphical design. Manually designed text animation is widely employed in advertisements, movie titles and web pages. In this paper we propose to link, through state-of-the-art NLP techniques, the affective content detection of a piece of text to the animation of the words in the text itself. This methodology allows us to automatically generate affective text animation and opens some new perspectives for advertisements, internet applications and intelligent interfaces